Abstract. In [12] we obtained a functional central limit theorem
(known also as a weak invariance principle) for sums of the form
$\sum_{1\le n\le Nt} F\big(X(n), X(2n), \dots, X(kn), X(q_{k+1}(n)), X(q_{k+2}(n)), \dots, X(q_\ell(n))\big)$
(normalized by $1/\sqrt{N}$) where $X(n)$, $n\ge 0$ is a sufficiently fast mixing vector
process with some moment conditions and stationarity properties, $F$ is a
continuous function with polynomial growth and certain regularity properties
and $q_i$, $i>k$ are positive functions taking on integer values on integers
with some growth conditions which are satisfied, for instance, when the $q_i$'s are
polynomials of growing degrees. This paper deals with strong invariance
principles (known also as strong approximation theorems) for such sums, which
provide their uniform in time almost sure approximation by processes built
out of Brownian motions with error terms growing slower than $\sqrt{N}$. This
yields, in particular, an invariance principle in the law of iterated logarithm
for the above sums. Among motivations for such results are their applications
to multiple recurrence for stochastic processes and dynamical systems, as well
as to some questions in metric number theory, and they can be considered
a natural follow-up of a series of papers dealing with nonconventional ergodic
averages.



1. Introduction 

Nonconventional ergodic theorems have attracted substantial attention in ergodic
theory (see, for instance, [1] and [5]). From a probabilistic point of view ergodic
theorems are laws of large numbers for stationary processes, and once they are
established it is natural to study deviations from the average. The most celebrated
result of this kind is the central limit theorem. In [12] we obtained a functional
central limit theorem for expressions of the form

(1.1)  $\xi_N(t) = \frac{1}{\sqrt N}\sum_{1\le n\le Nt}\big(F(X(q_1(n)),\dots,X(q_\ell(n))) - \bar F\big)$

and for the corresponding continuous time expressions of the form

(1.2)  $\xi_N(t) = \frac{1}{\sqrt N}\int_0^{Nt}\big(F(X(q_1(s)),\dots,X(q_\ell(s))) - \bar F\big)\,ds$

where $X(n)$, $n\ge 0$ is a sufficiently fast mixing vector valued process with some
moment conditions and stationarity properties, $F$ is a continuous function with
polynomial growth and certain regularity properties, $\bar F = \int F\,d(\mu\times\cdots\times\mu)$, $\mu$ is
the distribution of $X(0)$, $q_j(t) = jt$, $j\le k$ and $q_j$, $j>k$ are positive functions
taking on integer values on integers in the discrete time case with some growth
conditions which are satisfied, for instance, when the $q_j$'s are polynomials of growing
degrees, though we actually need much less. A substantially more restricted central
limit theorem for expressions of this sort was obtained in [11].
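
To make the object in (1.1) concrete, here is a minimal sketch that evaluates the normalized nonconventional sum for one fixed sample path. Everything in it is an illustrative assumption rather than part of the paper: the helper names `nonconventional_sum` and `xi`, the toy path, the choice $F(x_1,x_2) = x_1x_2$ (so that $\bar F = 0$ for a centered input) and the index functions $q_1(n) = n$, $q_2(n) = 2n$.

```python
import math
from typing import Callable, Sequence


def nonconventional_sum(x: Sequence[float],
                        F: Callable[..., float],
                        qs: Sequence[Callable[[int], int]],
                        t: float,
                        F_bar: float = 0.0) -> float:
    """Sum over 1 <= n <= t of F(x[q_1(n)], ..., x[q_l(n)]) - F_bar."""
    total = 0.0
    for n in range(1, int(t) + 1):
        # Evaluate the path at the times q_1(n), ..., q_l(n) and plug into F.
        total += F(*(x[q(n)] for q in qs)) - F_bar
    return total


def xi(x, F, qs, N, t, F_bar=0.0):
    """Normalized sum in the spirit of (1.1): N^{-1/2} times the sum up to N*t."""
    return nonconventional_sum(x, F, qs, N * t, F_bar) / math.sqrt(N)


# Hypothetical toy input: a fixed +-1 path indexed from 0.
path = [0, 1, -1, 1, 1, -1, 1, -1, -1]
qs = [lambda n: n, lambda n: 2 * n]      # q_1(n) = n, q_2(n) = 2n
product = lambda x1, x2: x1 * x2         # F(x1, x2) = x1 * x2
value = xi(path, product, qs, N=4, t=1.0)
```

The summand at time $n$ looks at the path at time $2n$ as well, which is the "dependence on the future" that the martingale constructions later in the paper have to handle.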

Functional central limit theorems are nowadays also called weak invariance
principles, while for more than 40 years now (probably since Strassen's work [16])
probabilists have also been interested in strong invariance principles, also called
strong approximation theorems. The latter provide almost sure or in average approximation
of a sum of $N$ random variables by a Brownian motion or, more generally, by a
Gaussian process with an error term growing slower than $\sqrt N$, which yields as a
result both the central limit theorem and the law of iterated logarithm, as well as
other limiting results which are clear or easy to prove for Gaussian processes.

We will show in this paper that the sums $\Xi(Nt) = \sqrt N\,\xi_N(t)$ appearing in (1.1)
can be represented as $\sum_{1\le i\le\ell}\Xi_i(Nt)$ where each $\Xi_i(Nt)$ can be approximated with
an error term of order $N^{\frac12-\alpha}$, $\alpha>0$ by a process $\sigma_iB_i(Nt)$ where $\sigma_i\ge 0$ is a constant
and $B_i$ is a Brownian motion. This result yields also a law of iterated logarithm
type result saying that with probability one all limit points as $N\to\infty$ of the
sequence $\xi_N(t)(\log\log N)^{-1/2}$, $t\in[0,1]$ of paths belong to a compact set.

Our methods employ the martingale approximation machinery from [12], enhanced
so as to obtain appropriate error estimates, together with the technique
from [14] which involves partition into blocks and Skorokhod embedding of martingales
into a Brownian motion (the latter was first used for similar purposes in [16]).
Observe that the summands in (1.1) depend strongly on the future and martingale
methods start working only after we force "the future to become present". For this
reason the role of martingales in our nonconventional framework was not self-evident
at the beginning but their effective use initiated in [12] opened a wide vista for
proving various limit theorems in this setup. It was shown in [12] that $\xi_N$ converges
weakly to a Gaussian process and it would be interesting to obtain a strong approximation
of $\sqrt N\,\xi_N(t)$ by such a Gaussian process but this would require dealing with
multidimensional approximations where the Skorokhod embedding we rely on does
not work. Observe that since the 1960s several other methods have been developed to
provide approximation of sums of random variables by a Brownian motion. Among
them is the quantile method (see, for instance, [13]) which provides essentially optimal
approximation but works only for independent random variables and for this
reason does not seem applicable to our setup. Another method developed by Stein
(see its recent account in [5]) also yields nearly optimal error estimates but it is
not yet clear whether it can be adapted to our situation. The advantages of yet
another method based on estimates of conditional characteristic functions (see, for
instance, [3]) lie in its applicability to the multidimensional situation where, for
instance, the Skorokhod embedding does not work well, but complications in the
use of characteristic functions exhibited in [11] make applicability of this method
in our setup doubtful.

As in [12] our results hold true when, for instance, $X(n) = T^nf$ where
$f = (f_1,\dots,f_\wp)$, $T$ is a mixing subshift of finite type, a hyperbolic diffeomorphism
or an expanding transformation taken with a Gibbs invariant measure (see, for
instance, [2]) and some other dynamical systems, as well as in the case when
$X(n) = f(\Upsilon_n)$, $f = (f_1,\dots,f_\wp)$ where $\Upsilon_n$ is a Markov chain satisfying the Doeblin
condition (see [10]) considered as a stationary process with respect to its invariant
measure. The main known application of results of the above type is to multiple
recurrence, where we employ our limit theorems for the random variable which counts
returns of the stochastic process under consideration to given sets. In this case the
function $F$ above is a product of some coordinate functions in which we plug in the
corresponding $X(q_i(n)) = \mathbb I_{A_i}(\eta(q_i(n)))$ where $\eta(m)$ is either $T^mx$ in the dynamical
systems case or $\Upsilon_m$ in the Markov chain case and $\mathbb I_A$ is the indicator of a set $A$.
This yields also applications to metric number theory providing limit theorems, for
instance, for the number $M_N(x)$ of times the first $N$ digits in the $m$-base or continued
fraction expansion of $x$ belong to a chosen subset of digits. As is well known, the
former expansions can be obtained via the multiplication by $m$ (expanding) transformation
while the latter via the Gauss map of the interval, and both dynamical
systems are exponentially fast $\psi$-mixing with respect to their invariant Lebesgue or
Gauss measure, respectively (see, for instance, [8]).

2. Preliminaries and main results 

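Our setup consists of a $\wp$-dimensional stochastic process $\{X(n),\ n = 0,1,\dots\}$ on a
probability space $(\Omega,\mathcal F,P)$ and of a family of $\sigma$-algebras $\mathcal F_{kl}\subset\mathcal F$, $-\infty\le k\le l\le\infty$
such that $\mathcal F_{kl}\subset\mathcal F_{k'l'}$ if $k'\le k$ and $l'\ge l$. The dependence between two sub
$\sigma$-algebras $\mathcal G,\mathcal H\subset\mathcal F$ is often measured via the quantities

(2.1)  $\varpi_{q,p}(\mathcal G,\mathcal H) = \sup\{\|E[g|\mathcal G]-E[g]\|_p:\ g \text{ is } \mathcal H\text{-measurable and } \|g\|_q\le 1\}$,

where the supremum is taken over real functions and $\|\cdot\|_r$ is the $L^r(\Omega,\mathcal F,P)$-norm.
Then the more familiar $\alpha$, $\rho$, $\phi$ and $\psi$-mixing (dependence) coefficients can be expressed
in the form (see [3], Ch. 4),

$\alpha(\mathcal G,\mathcal H) = \frac14\varpi_{\infty,1}(\mathcal G,\mathcal H)$, $\rho(\mathcal G,\mathcal H) = \varpi_{2,2}(\mathcal G,\mathcal H)$,

$\phi(\mathcal G,\mathcal H) = \frac12\varpi_{\infty,\infty}(\mathcal G,\mathcal H)$ and $\psi(\mathcal G,\mathcal H) = \varpi_{1,\infty}(\mathcal G,\mathcal H)$.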

The relevant quantities in our setup are 

(2.2)  $\varpi_{q,p}(n) = \sup_{k\ge 0}\varpi_{q,p}(\mathcal F_{-\infty,k},\mathcal F_{k+n,\infty})$

and accordingly

$\alpha(n) = \frac14\varpi_{\infty,1}(n)$, $\rho(n) = \varpi_{2,2}(n)$, $\phi(n) = \frac12\varpi_{\infty,\infty}(n)$ and $\psi(n) = \varpi_{1,\infty}(n)$.

Our assumptions will require a certain speed of decay as $n\to\infty$ of both the mixing
rates $\varpi_{q,p}(n)$ and the approximation rates defined by

(2.3)  $\beta_p(n) = \sup_{m\ge 0}\|X(m)-E(X(m)|\mathcal F_{m-n,m+n})\|_p$.

In what follows we can always extend the definitions of $\mathcal F_{kl}$ given only for $k,l\ge 0$
to negative $k$ by defining $\mathcal F_{kl} = \mathcal F_{0l}$ for $k<0$ and $l\ge 0$. Furthermore, we do not
require stationarity of the process $X(n)$, $n\ge 0$ assuming only that the distribution
of $X(n)$ does not depend on $n$ and the joint distribution of $\{X(n),X(n')\}$ depends
only on $n-n'$ which we write for further reference as

(2.4)  $X(n)\sim\mu$ and $(X(n),X(n'))\sim\mu_{n-n'}$ for all $n,n'$

where $Y\sim Z$ means that $Y$ and $Z$ have the same distribution.

Next, let $F = F(x_1,\dots,x_\ell)$, $x_j\in\mathbb R^\wp$ be a function on $\mathbb R^{\wp\ell}$ such that for some
$\iota,K>0$, $\kappa\in(0,1]$ and all $x_i,y_i\in\mathbb R^\wp$, $i = 1,\dots,\ell$,

(2.5)  $|F(x_1,\dots,x_\ell)-F(y_1,\dots,y_\ell)|\le K\big(1+\sum_{j=1}^\ell|x_j|^\iota+\sum_{j=1}^\ell|y_j|^\iota\big)\sum_{j=1}^\ell|x_j-y_j|^\kappa$

and

(2.6)  $|F(x_1,\dots,x_\ell)|\le K\big(1+\sum_{j=1}^\ell|x_j|^\iota\big)$.

The above assumptions are motivated by the desire to include, for instance, functions
$F$ polynomially dependent on their arguments. To simplify formulas we assume
a centering condition

(2.7)  $\bar F = \int F(x_1,\dots,x_\ell)\,d\mu(x_1)\cdots d\mu(x_\ell) = 0$

which is not really a restriction since we can always replace $F$ by $F-\bar F$.

Our setup includes also a sequence of increasing functions $q_1(n)<q_2(n)<\cdots<q_\ell(n)$
taking on integer values on integers and such that the first $k$ of them are
$q_j(n) = jn$, $j\le k$ whereas the remaining ones grow faster in $n$. We assume that
for $k+1\le i\le\ell$,

(2.8)  $q_i(n+1)-q_i(n)\ge n^\delta$

for some $\delta>0$ and all $n\ge 2$ while for $i\ge k$ and any $\varepsilon>0$,

(2.9)  $\liminf_{n\to\infty}(q_{i+1}(\varepsilon n)-q_i(n))>0$

which is equivalent in view of (2.8) to

(2.10)  $\liminf_{n\to\infty}(q_{i+1}(\varepsilon n)-q_i(n)) = \infty$.

In order to give a detailed statement of our main result as well as for its proof
it will be essential to represent the function $F = F(x_1,x_2,\dots,x_\ell)$ in the form

(2.11)  $F = F_1(x_1)+\cdots+F_\ell(x_1,x_2,\dots,x_\ell)$

where for $i<\ell$,

(2.12)  $F_i(x_1,\dots,x_i) = \int F(x_1,x_2,\dots,x_\ell)\,d\mu(x_{i+1})\cdots d\mu(x_\ell)-\int F(x_1,x_2,\dots,x_\ell)\,d\mu(x_i)\cdots d\mu(x_\ell)$

and

$F_\ell(x_1,x_2,\dots,x_\ell) = F(x_1,x_2,\dots,x_\ell)-\int F(x_1,x_2,\dots,x_\ell)\,d\mu(x_\ell)$

which ensures, in particular, that

(2.13)  $\int F_i(x_1,x_2,\dots,x_{i-1},x_i)\,d\mu(x_i) = 0\quad\forall\,x_1,x_2,\dots,x_{i-1}$.

These enable us to write

(2.14)  $\Xi(t) = \sum_{i=1}^\ell\Xi_i(t)$

where for $1\le i\le\ell$,

(2.15)  $\Xi_i(t) = \sum_{1\le n\le t}F_i(X(q_1(n)),\dots,X(q_i(n)))$.

The decomposition of $\Xi(t)$ above is different from [12] since we work here with each
$\Xi_i(t)$ separately, remaining all the time within a one dimensional framework, and do
not care about multidimensional covariances.
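
The decomposition (2.11)-(2.13) can be checked by hand when $\mu$ has finite support. The following sketch does this with exact rational arithmetic; it is only an illustration under assumptions that are not in the paper: the helper name `decompose`, the two-point law `mu` on $\{-1,1\}$ and the test function $F(x_1,x_2) = x_1x_2+x_1$ (which is centered, so (2.7) holds).

```python
from fractions import Fraction
from itertools import product


def decompose(F, mu, ell):
    """Return [F_1, ..., F_ell] as in (2.12) for a finitely supported law mu.

    mu maps each support point to its probability; F takes ell scalar arguments.
    """
    def integrate_from(xs, start):
        # Integrate F over the coordinates start, ..., ell (1-based) against mu,
        # keeping the first start-1 coordinates fixed to the values in xs.
        head = tuple(xs[:start - 1])
        total = Fraction(0)
        for tail in product(mu.items(), repeat=ell - len(head)):
            weight = Fraction(1)
            for _, prob in tail:
                weight *= prob
            total += weight * F(*(head + tuple(v for v, _ in tail)))
        return total

    def make(i):
        # F_i = (integral over x_{i+1..ell}) - (integral over x_{i..ell}).
        return lambda *xs: integrate_from(xs, i + 1) - integrate_from(xs, i)

    return [make(i) for i in range(1, ell + 1)]


# Hypothetical example: mu uniform on {-1, 1}, F(x1, x2) = x1*x2 + x1.
mu = {-1: Fraction(1, 2), 1: Fraction(1, 2)}
F = lambda x1, x2: x1 * x2 + x1
F1, F2 = decompose(F, mu, 2)
```

In this example $F_1(x_1) = x_1$ and $F_2(x_1,x_2) = x_1x_2$, and integrating $F_2$ over its last argument gives zero for every $x_1$, which is the property (2.13) that makes the later martingale construction possible.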
For each $\theta>0$ set

(2.16)  $\gamma_\theta^\theta = \|X\|_\theta^\theta = E|X(n)|^\theta = \int|x|^\theta\,d\mu(x)$.
Our main result relies on 

2.1. Assumption. With $d = (\ell-1)\wp$ there exist $p,q\ge 1$ and $\delta,m>0$ with
$\delta<\kappa-\frac dp$ satisfying

(2.17)  $\sum_{n=0}^\infty\varpi_{q,p}(n)<\infty$,

(2.18)  $\sum_{r=0}^\infty(\beta_q(r))^\delta<\infty$,

(2.19)  $\gamma_m<\infty$, $\gamma_{2q(\iota+2)}<\infty$ with $\frac{1}{2+\delta}\ge\frac1p+\frac{\iota+2}{m}+\frac\delta q$.

Following [13] we will write $Z(t)\ll a(t)$ a.s. for a family of random variables
$Z(t)$, $t\ge 0$ and a positive function $a(t)$, $t\ge 0$ if $\limsup_{t\to\infty}|Z(t)/a(t)|<\infty$ almost
surely (a.s.).

2.2. Theorem. Suppose that Assumption 2.1 holds true. Then without changing
their (own but maybe not joint) distributions the processes $\Xi_i(t)$, $t\ge 0$, $i = 1,\dots,\ell$
can be redefined on a richer probability space where there exist also standard Brownian
motions $B_i(t)$, $t\ge 0$, $i = 1,\dots,\ell$ such that for some constants $\alpha>0$ and
$\sigma_i\ge 0$, $i = 1,\dots,\ell$,

(2.20)  $\Xi_i(t)-\sigma_iB_i(t)\ll t^{\frac12-\alpha}$ a.s.

As usual (see [14] and [S]), relying on the well known invariance principle in the
law of iterated logarithm for the Brownian motion (see [15]) we obtain immediately
from the above theorem the following result.

2.3. Corollary. Let $K_i$ be the compact set of absolutely continuous functions $x$ in
$C[0,1]$ with $x(0) = 0$ and $\int_0^1\dot x^2(u)\,du\le\sigma_i^2$ and set $\zeta_{i,t}(u) = (2t\ln\ln t)^{-1/2}\Xi_i(tu)$,
$u\in[0,1]$. Then the family $\zeta_{i,t}$, $t\ge 3$ is relatively compact in the topology of uniform
convergence and as $t\to\infty$ the set of all a.s. limit points of $\zeta_{i,t}$ coincides with $K_i$.
Let $K$ be the compact set of functions $x\in C[0,1]$ which can be written in the form
$x(u) = \sum_{1\le i\le\ell}x_i(u)$ with $x_i\in K_i$, $i = 1,\dots,\ell$. Set $\zeta_t(u) = (2t\ln\ln t)^{-1/2}\Xi(tu)$,
$u\in[0,1]$. Then the family $\zeta_t$, $t\ge 3$ is relatively compact in the topology of uniform
convergence and as $t\to\infty$ the set of all a.s. limit points of $\zeta_t$ is contained in $K$.

In order to understand our assumptions observe that $\varpi_{q,p}$ is non-increasing in $q$
and non-decreasing in $p$. Hence, for any pair $p,q\ge 1$,

$\varpi_{q,p}(n)\le\psi(n)$.

Furthermore, by the real version of the Riesz-Thorin interpolation theorem (see,
for instance, [7], Section 9.3) if $\theta\in[0,1]$, $1\le p_0,p_1,q_0,q_1\le\infty$ and

$\frac1p = \frac{1-\theta}{p_0}+\frac{\theta}{p_1}$, $\frac1q = \frac{1-\theta}{q_0}+\frac{\theta}{q_1}$

then

$\varpi_{q,p}(n)\le 2(\varpi_{q_0,p_0}(n))^{1-\theta}(\varpi_{q_1,p_1}(n))^\theta$.

Since, clearly, $\varpi_{q_1,p_1}\le 2$ for any $q_1\ge p_1$ it follows for pairs $(\infty,1)$, $(2,2)$ and
$(\infty,\infty)$ that for all $q\ge p\ge 1$,

$\varpi_{q,p}(n)\le(2\alpha(n))^{\frac1p-\frac1q}$, $\varpi_{q,p}(n)\le 2^{1+\frac1p-\frac1q}(\rho(n))^{1-\frac1p+\frac1q}$

and $\varpi_{q,p}(n)\le 2^{1+\frac1p}(\phi(n))^{1-\frac1p}$.

We observe also that by the Hölder inequality for $q\ge p\ge 1$ and $\alpha\in(0,p/q)$,

$\beta_q(r)\le 2^{1-\alpha}[\beta_p(r)]^\alpha\gamma^{1-\alpha}_{\frac{pq(1-\alpha)}{p-q\alpha}}$

with $\gamma_\theta$ defined in (2.16). Thus, we can formulate Assumption 2.1 in terms of the more
familiar $\alpha$, $\rho$, $\phi$ and $\psi$-mixing coefficients and with various moment conditions.

The strategy of the proof of Theorem 2.2 consists of several steps. First, we
split the sum $\Xi_i(t)$ into a sum of "big" and "small" growing blocks so that the
total contribution of small blocks can be disregarded and their sole purpose is to
provide sufficient separation between big blocks. Growing blocks will enable us to
approximate their members by conditional expectations as in (2.3) with increasing
precision, which differs from [12] and is an important point in obtaining our estimates.
In spite of the fact that big blocks still remain strongly dependent in our
setup the technique of [12] enables us to treat them as if they were weakly dependent.
Namely, employing appropriate estimates from [12] we construct a martingale
approximation of sums of big blocks with an error sufficient for our purposes. Finally,
we rely on the Skorokhod embedding of martingales into a Brownian motion
and estimate the distance between the embedded process and the Brownian motion.

We observe that though the Skorokhod embedding preserves the distribution of each
one dimensional martingale it does not preserve, in general, joint distributions of
several martingales when we apply it simultaneously to $\ell$ of them as in our case. For
this reason we obtain strong approximations (2.20) for each $\Xi_i(t)$ but we do not obtain
a strong approximation of the sum $\Xi(t)$ by the Gaussian process $\sum_{i=1}^\ell\sigma_iB_i(t)$
which according to [12] is the weak limit of processes $N^{-1/2}\Xi(Nt)$ as $N\to\infty$. In
fact, this is connected with multidimensional strong approximation theorems where
the Skorokhod embedding is not applicable while other methods usually employed
in these circumstances do not seem to work in our nonconventional setup.

3. Blocks and martingale approximation

The following result, which is a part of Corollary 3.6 from [12] (improving in
several respects Lemma 3.1 from [11]), will be a base for our estimates.

3.1. Proposition. Let $\mathcal G$ and $\mathcal H$ be $\sigma$-subalgebras on a probability space $(\Omega,\mathcal F,P)$,
$X$ and $Y$ be $d$-dimensional random vectors and $f = f(x,\omega)$, $x\in\mathbb R^d$ be a collection
of random variables measurable with respect to $\mathcal H$ and satisfying

(3.1)  $\|f(x,\omega)-f(y,\omega)\|_q\le C_1(1+|x|^\iota+|y|^\iota)|x-y|^\kappa$ and $\|f(x,\omega)\|_q\le C_2(1+|x|^\iota)$

where $q\ge 1$. Set $g(x) = Ef(x,\omega)$. Then

(3.2)  $\|E(f(X,\cdot)|\mathcal G)-g(X)\|_\upsilon\le c\big(1+\|X\|^{\iota+2}_{b(\iota+2)}\big)\big(\varpi_{q,p}(\mathcal G,\mathcal H)+\|X-E(X|\mathcal G)\|^\delta_q\big)$

provided $\frac1\upsilon\ge\frac1p+\frac1b+\frac\delta q$ and $\kappa-\frac dp>\delta>0$ with $c = c(C_1,C_2,\iota,b,\kappa,\delta,p,q,\upsilon,d)>0$
depending only on the parameters in brackets. Moreover, let $x = (v,z)$ and $X = (V,Z)$,
where $V$ and $Z$ are $d_1$ and $d-d_1$-dimensional random vectors, respectively, and let
$f(x,\omega) = f(v,z,\omega)$ satisfy (3.1) in $x = (v,z)$. Set $g(v) = Ef(v,Z(\omega),\omega)$. Then

(3.3)  $\|E(f(V,Z,\cdot)|\mathcal G)-g(V)\|_\upsilon\le c\big(1+\|X\|^{\iota+2}_{b(\iota+2)}\big)\big(\varpi_{q,p}(\mathcal G,\mathcal H)+\|V-E(V|\mathcal G)\|^\delta_q+\|Z-E(Z|\mathcal H)\|^\delta_q\big)$.

We will use the following notations

(3.4)  $F_{i,r,n}(x_1,x_2,\dots,x_{i-1},\omega) = E\big(F_i(x_1,x_2,\dots,x_{i-1},X(n))\,|\,\mathcal F_{n-r,n+r}\big)$,
$X_r(n) = E(X(n)|\mathcal F_{n-r,n+r})$, $Y_i(q_i(n)) = F_i(X(q_1(n)),\dots,X(q_i(n)))$ and
$Y_i(j) = 0$ if $j\ne q_i(n)$ for any $n$, $Y_{i,r}(q_i(n)) = F_{i,r,q_i(n)}(X_r(q_1(n)),\dots,X_r(q_{i-1}(n)),\omega)$
and $Y_{i,r}(j) = 0$ if $j\ne q_i(n)$ for any $n$.

Next, we fix some positive numbers $4\eta<2\theta<\tau<1/2$ which will be specified
later on and following [13] introduce pairs of "big" and "small" increasing blocks
defining for each $i$ random variables $V_i(j)$ and $W_i(j)$ inductively so that

(3.5)  $V_i(1) = Y_{i,1}(q_i(1))$, $W_i(1) = Y_{i,1}(q_i(2))$, $a(1) = 0$, $b(1) = 1$ and for $j>1$,

$a(j) = b(j-1)+[(j-1)^\theta]$, $b(j) = a(j)+[j^\tau]$, $r(j) = [j^\eta]$,

$V_i(j) = \sum_{a(j)<l\le b(j)}Y_{i,r(j)}(q_i(l))$ and $W_i(j) = \sum_{b(j)<l\le a(j+1)}Y_{i,r(j)}(q_i(l))$.

Observe that unlike [12] but following [13] the parameter $r(j)$ grows with $j$,
increasing the precision of the conditional expectation approximations. Let
$\nu_i(t) = \max\{j:\ b(j)+[j^\theta]\le t+1\}$ which is the number of full small blocks in the sum $\Xi_i(t)$. We
will see that the small blocks $W_i(j)$, $j = 1,2,\dots$ make negligible contributions to
the sum $\Xi_i$ and can be disregarded while the big blocks $V_i(j)$, $j = 1,2,\dots$ are widely
separated, which enables us to exploit fully our mixing assumptions. Observe that
unlike the sums appearing in standard limit theorems these big blocks are strongly
(and not weakly) dependent but as in [12] we will see by means of Proposition 3.1
that only sufficient separation between $q_i(l)$ for different $l$'s plays a role.
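
The block bookkeeping in (3.5) is easy to get wrong, so a small sketch may help. It builds the endpoints $a(j)$, $b(j)$ and the window sizes $r(j)$, assuming the reading $a(j) = b(j-1)+[(j-1)^\theta]$, $b(j) = a(j)+[j^\tau]$, $r(j) = [j^\eta]$ of the definition; the function names and the particular exponents are illustrative choices, not taken from the paper.

```python
import math


def build_blocks(j_max, tau, theta, eta):
    """Return dicts a, b, r indexed by j = 1..j_max, following the reading of (3.5)."""
    a, b, r = {1: 0}, {1: 1}, {1: 1}
    for j in range(2, j_max + 1):
        a[j] = b[j - 1] + math.floor((j - 1) ** theta)  # skip the previous small block
        b[j] = a[j] + math.floor(j ** tau)              # big block of length [j^tau]
        r[j] = math.floor(j ** eta)                     # conditioning window for block j
    return a, b, r


def big_block(j, a, b):
    """Indices l with a(j) < l <= b(j), i.e. the summands of V_i(j)."""
    return range(a[j] + 1, b[j] + 1)


def small_block(j, a, b, theta):
    """Indices l with b(j) < l <= a(j+1), i.e. the summands of W_i(j)."""
    return range(b[j] + 1, b[j] + math.floor(j ** theta) + 1)


# Hypothetical exponents satisfying 4*eta < 2*theta < tau < 1/2.
tau, theta, eta = 0.4, 0.15, 0.05
a, b, r = build_blocks(50, tau, theta, eta)
```

With these conventions the big and small blocks alternate and tile the positive integers without gaps or overlaps, and the first blocks reduce to the single indices 1 and 2, matching $V_i(1)$ and $W_i(1)$ in (3.5).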
Next, set

(3.6)  $R_i(m) = \sum_{j=m+1}^\infty E(V_i(j)|\mathcal G_m)$

and $M_i(m) = V_i(m)+R_i(m)-R_i(m-1)$ where $\mathcal G_m = \mathcal F_{-\infty,q_i(b(m))+r(m)}$.
Observe that if $a(j)<l\le b(j)$ and $j\ge m+1$ then $X =
(X_{r(j)}(q_1(l)),\dots,X_{r(j)}(q_{i-1}(l)))$ is $\mathcal F_{-\infty,q_{i-1}(l)+r(j)}$-measurable while $f(x,\omega) =
F_{i,r(j),q_i(l)}(x_1,\dots,x_{i-1},\omega)$ is $\mathcal F_{q_i(l)-r(j),\infty}$-measurable. Hence, by (3.2) considered
with $\mathcal G = \mathcal F_{-\infty,\max(q_{i-1}(l)+r(j),\,q_i(b(m))+r(m))}$ and $\mathcal H = \mathcal F_{q_i(l)-r(j),\infty}$ we obtain that

(3.7)  $\|E(Y_{i,r(j)}(q_i(l))|\mathcal G_m)\|_{2+\delta}\le C\varpi_{q,p}(d_{i,j}(l))$

where $p$ and $q$ satisfy the conditions of Proposition 3.1 with $\upsilon = 2+\delta$ and of Assumption
2.1, $C>0$ does not depend on $i,j,l,m$ and

(3.8)  $d_{i,j}(l) = \min\big(q_i(l)-q_{i-1}(l)-2r(j),\ q_i(l)-q_i(b(m))-r(j)-r(m)\big)$
$\ge l-b(m)-2r(j)\ge a(j)-b(m)-2r(j)$

taking into account that under our assumptions

(3.9)  $q_i(l)-q_{i-1}(l)\ge l$ and $q_i(l)-q_i(m)\ge l-m$

provided $l$ is large enough. Thus, for $j\ge m+1$,

(3.10)  $\|E(V_i(j)|\mathcal G_m)\|_{2+\delta}\le C\sum_{a(j)<l\le b(j)}\varpi_{q,p}(l-b(m)-2r(j))$

and

(3.11)  $\|R_i(m)\|_{2+\delta}\le C\sum_{j=m+1}^\infty\sum_{a(j)<l\le b(j)}\varpi_{q,p}(l-b(m)-2r(j))\le C^*<\infty$

for some constant $C^*>0$. In particular, the series (3.6) converges in $L^{2+\delta}(\Omega,\mathcal F,P)$,
and so the definition of $R_i(m)$ makes sense. Observe also that $M_i(m)$ is $\mathcal G_m$-measurable
and

$E(M_i(m)|\mathcal G_{m-1}) = E(V_i(m)+R_i(m)|\mathcal G_{m-1})-R_i(m-1) = 0$

which means that $(M_i(m),\mathcal G_m)$, $m = 1,2,\dots$ is a martingale difference sequence.

We are going to replace the sum $\Xi_i(t)$ by the martingale $\sum_{1\le m\le\nu_i(t)}M_i(m)$ and
it will be crucial for our purposes to estimate the corresponding error. In order to
make the first step in this direction we set

$I_1(m) = \sum_{1\le j\le m}(V_i(j)-M_i(j))$

and relying on (3.11) it follows that

(3.12)  $\|I_1(m)\|_2 = \|R_i(m)-R_i(0)\|_2\le\|R_i(m)\|_2+\|R_i(0)\|_2\le 2\tilde C$

for some constant $\tilde C>0$. By Chebyshev's inequality

(3.13)  $P\{|I_1(m)|\ge\frac12m^{\frac12+\varepsilon}\}\le 16\tilde C^2m^{-(1+2\varepsilon)}$.
Observe that 

it] > E [j t ] ^ /" ,{ * uTdu = 0- + ^M*)) 1 

l<J<f<(t) 

and so

(3.14)  $\nu_i(t)\le((1+\tau)[t])^{\frac{1}{1+\tau}}\le 2[t]^{\frac{1}{1+\tau}}$.

Hence, taking m = Vi([t]) — Vi{t) and e — \{\t — 0) + \t{\ — 9) > we obtain by 
(l3~T3l) and (j3~T4| that 

(3.15) > < P{|/i(^-([t]))| 
> (WM))' +£ } < 16(7 2 (^([i]))-( 1+2£ ). 

Therefore, by the Borel-Cantelli lemma we conclude that for some $n_0 = n_0(\omega)$ and
all $\nu_i(t)\ge n_0$,

(3.16) \lMt))\<^ (1 - e) a.s. 
Observe that 

t<2 2J b' T ] < / (w T + l)du < 4t~ 1 (^W + i) 1 



\1+T 
and so

(3.17)  $\nu_i(t)\ge(\tau t/2)^{\frac{1}{1+\tau}}-1$.

Hence, if $t\ge 2\tau^{-1}(n_0+1)^{1+\tau}$ then (3.16) holds true.
Next, set

$I_2(m) = \sum_{a(m+1)<l\le a(m+2)}|Y_i(q_i(l))|$.

Since $a(\nu_i(t)+1)\le t<a(\nu_i(t)+2)$ then

(3.18) | Y M l ))\ < 

a(^(t))<(<t 

By and (12~T31) . 

(3.19)  $\|Y_i(q_i(l))\|_{2+\delta}\le C<\infty$

for some $C>0$ independent of $l$ and since by the construction $a(m+2)-a(m+1) =
[(m+1)^\tau]+[(m+1)^\theta]$ we see that

(3.20)  $\|I_2(m)\|_{2+\delta}\le\sum_{a(m+1)<l\le a(m+2)}\|Y_i(q_i(l))\|_{2+\delta}\le 2C(m+1)^\tau$.

By (I3.14j) and Chebyshev's inequality 

(3.21) P{|/ 2 (^W)| > 

< P{|/ 2 (^))| > (^(t))*^^} < C^(i))- 1 ^ 

for some (7 > independent of i where we assume that r < 4 min(<5, 1) and take 
e = ^ min(<5, 1), /3 = min(<5, 1). As in (13.16[) we conclude using (|3.1T[) and the 
Borel-Cantelli lemma that 

(3.22) h{vi{t)) < t^- s) a.s. 

for all $t\ge t_0$ and some random variable $t_0 = t_0(\omega)<\infty$.

Next, we estimate contribution of the small blocks. Let I > j then Wi(j) 
is measurable with respect to Q = J--oo.q t (a(j+i))+r(j) provided j v > 2, and so 
applying ([3"T2"|1 with such G, /(xi, Xi-i, u) = F i>r ^ qi ^( xi, ... ,Xj-i,u>) where 
b(l) < n < a(l + 1), H = FqiQ>(!))-r(!),oo we obtain by (|23)l . (|2~TO1) and (|33]) that 
for n large enough, 

(3.23) \EWi(j)Y iHl) ( qi (n))\ = \E(W t (j)E(Y lHl) (q z (n))\g))\ 

< CitJ7 gjP (n - r(l) - a(j + 1) - r(j))||T^(j)|| 2 

for some Ci > independent of j, n, I satisfying the conditions above. Since by 
(pro) , (pm?!) and the definition of blocks, 

(3.24) \\Wi(j)h< E m,rU)(qS))h < C2IA 

b(j)<l<a(j + l) 

for some C2 > independent of j, then 

(3.25) \E(WiV)Wi{l))\ < C 1 C 2 [7 9 ][/V ?lP ( E M " 2 KD)- 

_7<m<Z 
Hence, by (2.17), (3.24) and (3.25) for any positive integers $m<n$,

(3-26) E(z m<l<n wmf < T, m <i<n (Km + 2E m<i<i mw^mm) 

for some $C_3,C_4>0$ independent of $m$ and $n$. It follows by Theorem A1 from [13]
together with (3.14) that

(3.27) | w i{j)\^«t))i +e \og 3 Ui(t)<2^- E a.s. 

\<m<Ui(t) 

where e < (|t — 0)(1 + t) _1 and i is large enough. 
Next, set 

h(m) = \ E (^fe(0)-^rO-)fe(0))| 

l<3*<mo(3')<K«y+l) 

By (US]), ([23]), ([2TT51) and Holder's inequality (see Lemma 4.11 in [H]), 

(3-28) \\Yi(qi(l)) - Y lH3) { qi (l))h < Cf3 S q (r(l)) 

for some q, S > satisfying (|2.19|) and for a constant C > independent of j. 
Hence, by (j2~T^l) . 

(3.29) ll^(^(t)|| 2 < C< oo 

for some constant $C>0$ independent of $t$. Proceeding in the same way as in (3.16)
we obtain that for some random variable $t_0 = t_0(\omega)$,

(3.30) \h(vi(t))\ <^ (1 - e) a.s. 

whenever $t\ge t_0$. Finally, collecting (3.16), (3.22), (3.27) and (3.30) we conclude
that

(3.31)  $\big|\Xi_i(t)-\sum_{1\le j\le\nu_i(t)}M_i(j)\big|\ll t^{\frac12-\varepsilon}$

for some $\varepsilon>0$.

4. Completing the proof via Skorokhod embedding 

A martingale version of the Skorokhod embedding (representation) theorem (see
[16], Theorem 4.3 and [9], Theorem A1) applied to our martingale $\mathcal M_i(m) =
\sum_{1\le j\le m}M_i(j)$ yields that if $\{B_i(t),\ t\ge 0\}$ is a standard Brownian motion then
there exist non-negative random variables $T_j = T_{i,j}$ such that the processes

(4.1)  $\{B_i(\sum_{1\le j\le m}T_j),\ m\ge 1\}$ and $\{\mathcal M_i(m),\ m\ge 1\}$

have the same distributions. Hence, without loss of generality we can redefine
$\{M_i(j),\ j\ge 1\}$ by

(4.2)  $M_i(m) = B_i(\sum_{1\le j\le m}T_j)-B_i(\sum_{1\le j\le m-1}T_j)$

and can keep the same notations for both $M_i(m)$ and $\mathcal M_i(m)$. In fact, we will
redefine also the processes $X(n)$, $V_i(m)$, $W_i(m)$ we had before on a richer probability
space, common with $M_i(m)$, so that all marginal and joint distributions
remain intact. Furthermore, the embedding theorem cited above yields that if
$\mathcal A_m$ is the $\sigma$-algebra generated by $\{B_i(t),\ 0\le t\le\sum_{1\le j\le m}T_j\}$ then $T_m$ is
$\mathcal A_m$-measurable, $B_i(\sum_{1\le j\le m}T_j+s)-B_i(\sum_{1\le j\le m}T_j)$ is independent of $\mathcal A_m$ for any
$s\ge 0$,

(4.3) E(T m \Am-i) = £(M?(m)|An-i) = £(M?(m)|S m _i) - £7(M?Ml6n-i) 
and 

(4.4)  $E(T_m^u|\mathcal A_{m-1})\le c_uE(|M_i(m)|^{2u}|\mathcal A_{m-1})$

where c u > depends only on«> 1, A m D 0m 3 Q m = a{Mi(j), 1 < j < m} and 
t/ m is the same as in (|3.6p . 
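
The simplest instance of the Skorokhod embedding may help to see why the conditional mean of the stopping time matches the conditional second moment of the martingale increment, as in the first equality of (4.3). For a mean-zero variable taking the values $-a$ and $b$ one can take $T$ to be the first exit time of a Brownian motion from $(-a,b)$; the classical facts $P(B_T = b) = a/(a+b)$ and $ET = ab$ are used below. This is only a sketch of that textbook special case with hypothetical helper names, not the general construction from [16] or [9].

```python
from fractions import Fraction


def two_point_embedding(a, b):
    """Law of B_T and E[T] for T = first exit time of Brownian motion from (-a, b)."""
    a, b = Fraction(a), Fraction(b)
    p_up = a / (a + b)                 # probability of leaving through b
    law = {b: p_up, -a: 1 - p_up}      # distribution of the embedded variable
    expected_exit_time = a * b         # classical formula E[T] = a*b
    return law, expected_exit_time


def moment(law, k):
    """k-th moment of a finitely supported law given as {value: probability}."""
    return sum(p * v ** k for v, p in law.items())


# Hypothetical example: exit from the interval (-1, 3).
law, expected_T = two_point_embedding(1, 3)
```

In this example the embedded variable has mean zero and its second moment equals the expected exit time, which is the two-point analogue of $E(T_m|\mathcal A_{m-1}) = E(M_i^2(m)|\mathcal A_{m-1})$.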

In order to exploit the representation

(4.5)  $\sum_{1\le j\le m}M_i(j) = B_i(\sum_{1\le j\le m}T_j)$

we have to establish a strong law of large numbers with appropriate error estimates
for sums of $T_j$'s in the form

(4.6)  $\big|\sum_{1\le j\le\nu_i(t)}T_j-\sigma_i^2t\big| = O(t^{1-\lambda})$ a.s.

for some $\lambda>0$ and $\sigma_i\ge 0$. This would imply that

(4.7)  $\big|B_i(\sum_{1\le j\le\nu_i(t)}T_j)-B_i(\sigma_i^2t)\big|\ll t^{\frac12-\tilde\lambda}$ a.s.

for some $\tilde\lambda<\frac12\lambda$. Indeed, set $\tau_i(t) = \sum_{1\le j\le\nu_i(t)}T_j$. Then (4.6) means that
$|\tau_i(t)-\sigma_i^2t|\le Qt^{1-\lambda}$ for some random variable $Q = Q(\omega)<\infty$ a.s. Introducing
the events $\Omega_N = \{Q\le N\}$ we obtain

$A(t) = |B_i(\tau_i(t))-B_i(\sigma_i^2t)|\,\mathbb I_{\Omega_N}\le A_1(t)+A_2(t)+A_3(t)$

where

$A_1(t) = \sup_{0\le s\le Nt^{1-\lambda}}|B_i(\sigma_i^2t+s)-B_i(\sigma_i^2t)|$,
$A_2(t) = |B_i(\sigma_i^2t)-B_i(\sigma_i^2t-Nt^{1-\lambda})|$ and
$A_3(t) = \sup_{0\le s\le Nt^{1-\lambda}}|B_i(\sigma_i^2t-Nt^{1-\lambda}+s)-B_i(\sigma_i^2t-Nt^{1-\lambda})|$.

By the martingale moment inequalities for the Brownian motion

$EA_j^{2m}(t)\le C_mN^mt^{m(1-\lambda)}$, $j = 1,2,3$

where $C_m>0$ depends only on $m\ge 1$. Thus

$P\{A(n)>n^{\frac12-\tilde\lambda}\}\le 3^{2m}C_mN^mn^{-m(\lambda-2\tilde\lambda)}$.

Choose $\tilde\lambda<\frac12\lambda$ and $m>2(\lambda-2\tilde\lambda)^{-1}$ then $n^{-m(\lambda-2\tilde\lambda)}<n^{-2}$ and so the probabilities
above form a converging series. Hence, by the Borel-Cantelli lemma there
exists $n_0 = n_0(\omega)<\infty$ such that

$A(n)\le n^{\frac12-\tilde\lambda}$ a.s. for all $n\ge n_0$.

Since $\nu_i(t)$, and so also $\tau_i(t)$, can change only at integer $t$ and since $\Omega_N\uparrow\tilde\Omega$ as
$N\uparrow\infty$ with $P(\tilde\Omega) = 1$ we conclude that, indeed, (4.6) implies (4.7). Finally,
redefining without changing distributions all processes once again we can replace
$B_i(\sigma_i^2t)$ by $\sigma_iB_i(t)$ arriving at the assertion of Theorem 2.2.
We start deriving (4.6) by writing

(4.8) J2 & - M i^)) = ~ ^V), 
where 

D«(m) = (Tj-E&lAj-!)), D^(m)= ]T {Mf{j)-E{M^{j)\Qj-x)l 

l<j<m 1 < j < rn 

and using (JO) in order to have g^]). Set i? (1) (j) = 7) - E { T j\^j-i) then 
(R^(j), Aj)j>\ is a martingale differences sequence. By p. lip . (|3.19p and (14. 4[) 
for any j > 1, 

£|Jj(i>(,-)|i+*« < 2£|T,| 1+ * 5 < 2c 1+ i,£|M l (j)| 2+5 < C(l+£;|V-(j)| 2 + 5 ) < c/ 2 +^ 

for some C,C > independent of j. Observe that (j~^ 1+T+e ' > R^\j), Aj)j>i is also 
a martingale differences sequence and assume that r < 8/ A and t + e < 1/4. Then 

oo oo 

and so by the standard result on martingale series (see Theorem 2.17 in [9]), 
^ j _ ' 1+r_£ 'i? < ' 1 '(j) converges a.s. 

l<j<oo 

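Hence, by Kronecker's lemma

$m^{-(1+\tau-\varepsilon)}\sum_{j=1}^mR^{(1)}(j) = m^{-(1+\tau-\varepsilon)}D^{(1)}(m)\to 0$ a.s. as $m\to\infty$,

and so by (3.14),

(4.9)  $t^{-(1-\frac{\varepsilon}{1+\tau})}|D^{(1)}(\nu_i(t))|\le 4(\nu_i(t))^{-(1+\tau-\varepsilon)}|D^{(1)}(\nu_i(t))|\to 0$ a.s. as $t\to\infty$.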
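
Kronecker's lemma, used in the step above, says that if $b_j\uparrow\infty$ and $\sum_jx_j/b_j$ converges then $b_m^{-1}\sum_{j\le m}x_j\to 0$. The following deterministic sketch only illustrates the statement numerically; the sequences $x_j = (-1)^j\sqrt j$ and $b_j = j^{3/4}$ are hypothetical stand-ins for $R^{(1)}(j)$ and $j^{1+\tau-\varepsilon}$, chosen so that the weighted series is a convergent alternating series.

```python
import math


def kronecker_ratio(x, b, m):
    """Return (x(1) + ... + x(m)) / b(m), the quantity Kronecker's lemma sends to 0."""
    return sum(x(j) for j in range(1, m + 1)) / b(m)


# Hypothetical sequences: sum_j x(j)/b(j) = sum_j (-1)^j j^(-1/4) converges
# by the alternating series test, while b(j) increases to infinity.
x = lambda j: (-1) ** j * math.sqrt(j)
b = lambda j: j ** 0.75

ratio_small = abs(kronecker_ratio(x, b, 100))
ratio_large = abs(kronecker_ratio(x, b, 10_000))
```

The normalized partial sums shrink as $m$ grows even though the unnormalized partial sums are unbounded, which is the mechanism that turns convergence of the weighted martingale series into the rate in (4.9).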

Setting $R^{(2)}(j) = M_i^2(j)-E(M_i^2(j)|\mathcal G_{j-1})$ we obtain that $(R^{(2)}(j),\mathcal G_j)_{j\ge 1}$ is a
martingale differences sequence, as well, and by (3.11) and (3.19),

$E|R^{(2)}(j)|^{1+\frac\delta2}\le 2E|M_i(j)|^{2+\delta}\le C(1+E|V_i(j)|^{2+\delta})\le\tilde Cj^{(2+\delta)\tau}$

for some $C,\tilde C>0$ independent of $j$. Thus, in the same way as above, we see that

(4.10)  $t^{-(1-\frac{\varepsilon}{1+\tau})}|D^{(2)}(\nu_i(t))|\to 0$ a.s. as $t\to\infty$.

It follows from (4.8)-(4.10) that in order to obtain (4.6) it suffices to show that
there exists $\sigma_i\ge 0$ such that

(4.11)  $\big|\sum_{1\le j\le\nu_i(t)}M_i^2(j)-\sigma_i^2t\big| = O(t^{1-\lambda})$ a.s.

for some $\lambda>0$. By the definition of $M_i(j)$ and the Cauchy inequality,

(4.12) | J2 (M 2 { ] )-V 2 ( 3 ))\<{Mm)f/ 2 {2{ ]T V 2 (j)) 1/2 + (Ai(m))^ 2 ) 

l<j<m 1 <J <rri 

where ^(m) = Ei<;<™ Pi 0'). PiC?) = ^0') ~ ^(i " 1) and ^I^M| < mC 2 by 
(|3.1ip . Fix j3 > and for each Z > 1 set m; = [7 2 /^] then by Chebyshev's inequality 

(4.13) filA^m)] > m\ +fi } < C 2 m^ < CV 2 
for some $C>0$ independent of $l$. Therefore, by the Borel-Cantelli lemma for all
$l\ge l_0 = l_0(\omega)<\infty$,

\Ai(mi)\ < m] +p a.s. 
If mi < Vi(t) < mi + i and I > Iq then by (|3.14[) . 

\Ai{vm\ < \Mmi+i)\ < m, x +f < ( Vi (jt))^C^±) l +P < Ot$& a.s. 

+ mi 

where C > does not depend on I. Choosing /3 = r/2 we obtain that 

(4.14) M v i(t)) ^ C^ 12 ^ a.s. 

for all t > to = to(u}) < oo where in view of p. 171) we can take to — -((Iq + l) 2 ^ + 
2) 1+r . 

It follows from (4.12) and (4.14) that in order to obtain (4.11) it remains to show
that

(4.15)  $\big|\sum_{1\le j\le\nu_i(t)}V_i^2(j)-\sigma_i^2t\big| = O(t^{1-\lambda})$ a.s.

for some $\lambda>0$. Next, we will make yet another reduction showing that (4.15) will
follow if

(4.16)  $\big|E\big(\sum_{1\le l\le t}Y_i(q_i(l))\big)^2-\sigma_i^2t\big| = O(t^{1-\lambda})$

for some $\lambda>0$. A transition from (4.16) to (4.15) proceeds in the same way
as in Lemma 7.3.5 of [14] but for the readers' convenience we sketch the
corresponding argument here as well.
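First, we write

(4.17)  $\big|E\big(\sum_{1\le l\le t}Y_i(q_i(l))\big)^2-\sum_{1\le j\le\nu_i(t)}V_i^2(j)\big|\le J_1(t)+J_2(\nu_i(t))$

where

$J_1(t) = \big|E\big(\sum_{1\le l\le t}Y_i(q_i(l))\big)^2-\sum_{1\le j\le\nu_i(t)}EV_i^2(j)\big|$

and

$J_2(m) = \big|\sum_{1\le j\le m}(V_i^2(j)-EV_i^2(j))\big|$.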

Next, 

(4.18) Ji(<) < J n (ui(t)) + Jia(fi(*)) + JisiM*)) + Ju(vi(t)) + JisiyiQ)) 
where 

Ju(m) = 2£ 1 < j< j< m I^CjJVSG)!, Ji 2 (m) = B(Ei<i< m TO)) 2 , 

J 13 (m) = S(/ 2 (m)) 2 , J 14 (m) = E(I 3 (m)) 2 and 

Ji 5 (m) = 2U E 1 < j < ro Vi(j)h(jU\m) + J'uim) + j\L\m)) 

with I 2 and I3 the same as in (|3.20|) and (|3.29|) , respectively. Using (|3.7p - (|3.10l) we 
obtain similarly to (|3T23]> — (|3T26]> that 

(4.19) J n (m) < 2CEi< J<J < m ll^(^ll2E a J)</< b G)^,Pa- & (j)-2rj)) 
for some $C,\tilde C>0$ independent of $m$. For $J_{12}(m)$, $J_{13}(m)$ and $J_{14}(m)$ we already
have appropriate estimates in (3.26), (3.20) and (3.29), respectively. Employing
(|3.2j) from Proposition I3H] together with Assumption 12. II in order to estimate a;„ = 
\EYi(qi(l))Yi(qi(n))\ we see (see (|4.31[) and (I4.34[) below as well as Lemma 5.1 from 
[12]) that J2i<i< n <t a; ™ * s °f or der 0(t), and so for m < fi(i), 

(4.20) 25( £ ^(j)) 2 < E ^fe(0) + 2 £ a ^^ Ct 

l<j<m l<l<t \<l<n<t 

for some C > independent of i. Combining (|3~14l) . (EHO]) . (EHoT) . ([3T2^]) and 
(|4~T8]) - (|430l) we obtain that 

(4.21) Jii(^(i)) < Ct 1 - 6 

for some C > independent of t where e = (t — 26*)/ (1 + r). 
In order to estimate ^(t) we set 

y, 2 (j) - EV*(j) if |V?(j) - < 
otherwise 



where a € [t, j] will be further specified later on. Observe that 

(4.22) P{U t {j) £ V?{j) - EV 2 (j)} = P{\V 2 {j) - EV?{j)\ > J 1+a } 

< 2 1 +lj-( 1 +' T )( 1 +l) < 2 1+ i j- { - 1+ i^ 1+a - 2T \ 

Since a > 2r then the power of j in the right hand side of (|4.22[) is less than 
— 1, and so by the Borel-Cantelli lemma with probability one the event {Ui(j) ^ 
V 2 {j) — EV 2 (j)} can occur only finite number of times. Hence, the asymptotical 
behaviot as m — > oo of him) and of J^m) = | X)i<j<m ^» 0')l ^ s * ne same (up 
to a random variable independent of m) and it suffices to estimate the latter. Set 
U*(j) = Ui(J) - EUi(j). Using (157TU]) we obtain that for j < f , 

(4.23) \EU:(j)U*(j')\<(jf) 1 ^w q , p (f+ E "O- 

j<m<j' 

Next, 
(4.24) 

E(U*(jjf < 2j^+^ 1 -^E\U*(j)\ 1 +^ < 32j ( - 1 +^ 1 ~^E\V l (j)\ 2+s < Cj 1+2r - £ 

for some C > independent of j where e — | — cr + 4^ — T(5 and we choose <r and r 
so small that e > (5/8. It follows from (|2~T7| . P~2"3"]l and P~24) that for some (7 > 
independent of n and m, 

n 

(4.25) £( E ^O')) 2 < C(n 2+2T - e - m 2+2T - £ ) 

j=m+l 

and applying again Theorem Al from P3] we obtain by (|3.14[) and (I4.25|) that 

(4.26) | E ^(i)! « (^W) 1+T "^ £ < 2i 1 -3t*7 a.s. 

l<j< Vi (t) 

Hence, ^(^(i)) <C t ~ 2 < 1 + T ) well. 
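
The exponent in (4.24) can be verified by direct arithmetic. A sketch, assuming a moment bound of the form $E|V_i(j)|^{2+\delta} \le Cj^{\tau(2+\delta)}$ (this bound is an assumption of the illustration, not restated in this passage):

```latex
\begin{align*}
(1+\sigma)\Big(1-\frac{\delta}{2}\Big) + \tau(2+\delta)
 &= 1 + \sigma - \frac{\delta}{2} - \frac{\sigma\delta}{2} + 2\tau + \tau\delta \\
 &= 1 + 2\tau - \Big(\frac{\delta}{2} - \sigma + \frac{\sigma\delta}{2} - \tau\delta\Big)
  = 1 + 2\tau - \epsilon, \\
\epsilon - \frac{\delta}{8}
 &= \frac{3\delta}{8} - \sigma\Big(1-\frac{\delta}{2}\Big) - \tau\delta .
\end{align*}
```

In particular $\epsilon > \delta/8$ holds as soon as $\sigma(1-\delta/2) + \tau\delta < 3\delta/8$, which is the sense in which $\sigma$ and $\tau$ are taken small.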

Finally, it remains to establish (4.16). In fact, existence of the limit

$\lim_{t\to\infty} \frac{1}{t} E\Big(\sum_{1\le n\le t} Y_i(q_i(n))\Big)^2$

and its computation are given in Propositions 4.1 and 4.5 from [12] and we only have
to explain the estimate (4.16) which is actually hidden inside the proof there. If
$i \le k$ then the above limit has the form (see Proposition 4.1 in [12])
(4.27)  $a_i^2 = \sum_{l=-\infty}^{\infty} a_i(l)$ with $a_i(l) = \int F_i(x_1,\dots,x_i)F_i(y_1,\dots,y_i)\prod_{1\le u\le i} d\mu_{ul}(x_u,y_u)$

where $\mu_n$ is the same as in (2.4) and $d\mu_0(x,y) = \delta_{xy}\,d\mu(x)$ is the measure supported
by the diagonal. If $i > k$ then (see Proposition 4.5 in [12]),

(4.28)  $a_i^2 = \int F_i^2(x_1,\dots,x_i)\,d\mu(x_1)\cdots d\mu(x_i).$
We have

(4.29)  $E\Big(\sum_{1\le n\le t} Y_i(q_i(n))\Big)^2 = \sum_{1\le n,n'\le t} b_i(n,n')$

where

$b_i(n,n') = EF_i(X(q_1(n)),\dots,X(q_i(n)))F_i(X(q_1(n')),\dots,X(q_i(n'))).$

If $i \le k$ then for each integer $m$ we consider $b_i(n,n')$ with $in - in' = im$.



\bi(n, n) — a,i(m)\ < Ci{vj q _ p {- max(n, n')) + Aj(- max(n, n')). 



Assume that $|im| \le \frac14\max(n,n')$; then we can apply (3.3) of Proposition 3.1

with Q = .F-co,(i-3/4)max(n,n')' ^ = max(n,n')>°°' ^ = ("^ ( n ) ' *" ' ^ (* — 

l)n;X(n'),...,X((i - l)n')) and 2 = (X(in), X(in')) which gives that 

\h{n,n') - J Si!i(A-(n),..., - l)n),x)F l (X(n'), 
...,X((i- l)n'),y)d(j, im (x,y)\ < Ci(zo qtP (\max(n,n')) + fi q (\ max(n, »'))) 

for some $C_1 > 0$ independent of $n, n'$. Repeating these estimates $i$ times we obtain
that

[-max(n,n'))+ P q (- 

If $|im| > \frac14\max(n,n')$, say $im > \frac14\max(n,n')$, then applying (3.3) of Proposition
Owith = -7 r _ 00 , max ( m - ! ( l _i)„) + _i ; „, "H = ^(i-i/iejn.oo, V = (A(n),...,X((i - 
l)n); X(n'), X(in')) and Z = X(in) which yields that 

$|b_i(n,n')| \le C_2(\varpi_{q,p}(n/16) + \beta_q(n/16))$

for some $C_2 > 0$ independent of $n$. The same estimate holds true if $im < -n/4$
with $n'$ in place of $n$, and so we can replace above $n$ by $\max(n,n')$. Next, we
want to show that a similar estimate holds true for $a_i(m)$ when $|im| > \frac14\max(n,n')$,
assuming first that $im > \frac14\max(n,n')$. Since for $in - in' = im$,

$a_i(m) = \int EF_i(x_1,\dots,x_{i-1},X(in))F_i(y_1,\dots,y_{i-1},X(in'))\prod_{1\le u\le i-1} d\mu_{um}(x_u,y_u)$

we can apply (031) of Proposition O with Q = F_ oom , + i_ n , H = J r (l -_i_ )nj00 , 
V = (x%, Xi-x;yi, yi-i,X(in')) and Z = X(m) which yields that 

$|a_i(m)| \le C_3(\varpi_{q,p}(n/16) + \beta_q(n/16))$

for some $C_3 > 0$ independent of $n$. If $im < -n/4$ then we obtain a similar estimate
with $n$ replaced by $n'$, and so we can replace $n$ in the above estimate by $\max(n,n')$.






Collecting the above estimates we obtain that if $i \le k$ and $in - in' = im$ for an
integer $m$ then

(4.30)  $|b_i(n,n') - a_i(m)| \le C_4\big(\varpi_{q,p}(\tfrac{1}{16}\max(n,n')) + \beta_q(\tfrac{1}{16}\max(n,n'))\big)$

for some $C_4 > 0$ independent of $n$. By (2.17), (2.18) and (4.30) we obtain that for
$i \le k$,

(4.31)  $\Big|\sum_{1\le n,n'\le t} b_i(n,n') - ta_i^2\Big| \le C_5 t^{1-\delta}$

for some $C_5 > 0$ independent of $t$.

Next, we consider the case $i \ge k+1$. It follows from (2.8) that if $n \ne n'$ and
$\max(n,n')$ is large enough then $|q_i(n) - q_i(n')| \ge (\max(n,n'))^{\delta}$. Hence, relying on
(3.3) in Proposition 3.1 it is easy to see, similarly to the above, that in this case

(4.32) |6i(n,n')| < C 6 (zJ q , P (\h(n) - <fc(n')|) + 9 (~|«<(n) - <fc(n')|)) 

for some $C_6 > 0$ independent of $n$ and $n'$. In order to estimate the difference
between $b_i(n,n)$ and $a_i^2$ from (4.28) we use that $q_i(n) - q_{i-1}(n) \ge n$ for large $n$
which yields that

\bi(n,n)- J EF?(X(n),...,X((i-l)n),x)du(x)\<C 7 (w q , p (^n) + P q (^n)) 

for some CV > independent of n where we rely on ()3.3|) from Proposition 13. II with 
Q = J r _ 00:(i _| ) „, "H = 7" ( i_i )n)0O , V = (X(n),...,X((i - l)n)) and Z = X(m). 
Repeating this estimate i times we obtain that 

(4.33) I J2 h(n,n)-ta?\<C 8 ]T K, p (in) + /3 9 (~n)) 

0<n<t 0<n<t 

for some $C_8 > 0$ independent of $t$. This together with (2.5), (2.17), (2.18) and
(4.32) yields that

(4.34)  $\Big|\sum_{0\le n,n'\le t} b_i(n,n') - ta_i^2\Big| \le C_9 t^{1-\delta}$

for some $C_9 > 0$ independent of $t$. Finally, (4.29), (4.31) and (4.34) yield (4.16),
completing the proof of Theorem 2.2. □